consensus definition
Training and Evaluation of Guideline-Based Medical Reasoning in LLMs
Staniek, Michael, Sokolov, Artem, Riezler, Stefan
Machine learning for early prediction in medicine has recently shown breakthrough performance; however, the focus on improving prediction accuracy has led to a neglect of faithful explanations that are required to gain the trust of medical practitioners. The goal of this paper is to teach LLMs to follow medical consensus guidelines step-by-step in their reasoning and prediction process. Since consensus guidelines are ubiquitous in medicine, instantiations of verbalized medical inference rules to electronic health records provide data for fine-tuning LLMs to learn consensus rules and possible exceptions thereof for many medical areas. Consensus rules also enable an automatic evaluation of the model's inference process regarding its derivation correctness (evaluating correct and faithful deduction of a conclusion from given premises) and value correctness (comparing predicted values against real-world measurements). We exemplify our work using the complex Sepsis-3 consensus definition. Our experiments show that small fine-tuned models outperform one-shot learning of considerably larger LLMs that are prompted with the explicit definition and models that are trained on medical texts including consensus definitions. Since fine-tuning on verbalized rule instantiations of a specific medical area yields nearly perfect derivation correctness for rules (and exceptions) on unseen patient data in that area, the bottleneck for early prediction is not out-of-distribution generalization, but the orthogonal problem of generalization into the future by forecasting sparsely and irregularly sampled clinical variables. We show that the latter results can be improved by integrating the output representations of a time series forecasting model with the LLM in a multimodal setup.
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- Pacific Ocean > North Pacific Ocean > Gulf of Thailand (0.04)
- North America > United States > New Mexico > Bernalillo County > Albuquerque (0.04)
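The two evaluation notions named in the abstract above, derivation correctness and value correctness, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `ReasoningStep` container, the `toy_rule` threshold, and the exact-match tolerance are hypothetical and are not taken from the paper, which may compute both metrics differently.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class ReasoningStep:
    """Hypothetical container for one model-generated inference step."""
    premises: dict[str, float]  # values the model states or predicts
    conclusion: bool            # conclusion the model draws from them


def toy_rule(values: dict[str, float]) -> bool:
    # Illustrative rule only: flag an increase of at least 2 points.
    return values["score_now"] - values["score_baseline"] >= 2


def derivation_correct(step: ReasoningStep,
                       rule: Callable[[dict[str, float]], bool]) -> bool:
    # The conclusion must follow from the model's OWN premises under the rule.
    return rule(step.premises) == step.conclusion


def value_correct(predicted: dict[str, float],
                  measured: dict[str, float],
                  tol: float = 0.0) -> bool:
    # The predicted values must match the real-world measurements.
    return all(abs(predicted[k] - measured[k]) <= tol for k in measured)


step = ReasoningStep({"score_baseline": 1.0, "score_now": 4.0}, True)
measured = {"score_baseline": 1.0, "score_now": 2.0}
faithful = derivation_correct(step, toy_rule)
accurate = value_correct(step.premises, measured)
```

In this toy case the deduction is faithful to the rule while the predicted value is wrong, which mirrors the abstract's observation that the bottleneck lies in forecasting the values rather than in applying the rule.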
Early Prediction of Causes (not Effects) in Healthcare by Long-Term Clinical Time Series Forecasting
Staniek, Michael, Fracarolli, Marius, Hagmann, Michael, Riezler, Stefan
Machine learning for early syndrome diagnosis aims to solve the intricate task of predicting a ground truth label that most often is the outcome (effect) of a medical consensus definition applied to observed clinical measurements (causes), given clinical measurements observed several hours before. Instead of focusing on the prediction of the future effect, we propose to directly predict the causes via time series forecasting (TSF) of clinical variables and determine the effect by applying the gold standard consensus definition to the forecasted values. This method has the invaluable advantage of being straightforwardly interpretable to clinical practitioners, and because model training no longer relies on a particular label, the forecasted data can be used to predict any consensus-based label. We exemplify our method by means of long-term TSF with Transformer models, with a focus on accurate prediction of sparse clinical variables involved in the SOFA-based Sepsis-3 definition and the new Simplified Acute Physiology Score (SAPS-II) definition. Our experiments are conducted on two datasets and show that, contrary to recent proposals that advocate set function encoders for time series and direct multi-step decoders, best results are achieved by a combination of standard dense encoders with iterative multi-step decoders. The success of iterative multi-step decoding can be attributed to its ability to capture cross-variate dependencies and to a student forcing training strategy that teaches the model to rely on its own previous time step predictions for the next time step prediction.
- North America > Canada > Quebec > Montreal (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Europe > Germany (0.04)
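The idea in the abstract above, forecasting the causes iteratively and then applying a fixed rule to the forecasts, can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: `last_delta_model` is a hypothetical stand-in for a trained Transformer forecaster, and `rule_fires` is only loosely modeled on a score-increase criterion; it is not the Sepsis-3 or SAPS-II definition.

```python
from __future__ import annotations

from typing import Callable, Sequence

Vector = list[float]


def iterative_forecast(history: Sequence[Vector],
                       one_step: Callable[[Sequence[Vector]], Vector],
                       horizon: int) -> list[Vector]:
    """Roll a one-step model forward, feeding each prediction back as input."""
    window = list(history)
    forecasts: list[Vector] = []
    for _ in range(horizon):
        nxt = one_step(window)
        forecasts.append(nxt)
        window.append(nxt)  # the model consumes its own previous prediction
    return forecasts


def last_delta_model(window: Sequence[Vector]) -> Vector:
    # Hypothetical stand-in forecaster: linear extrapolation of the last two steps.
    prev, last = window[-2], window[-1]
    return [l + (l - p) for p, l in zip(prev, last)]


def rule_fires(baseline: float,
               forecasts: Sequence[Vector],
               score_fn: Callable[[Vector], float],
               min_increase: float = 2.0) -> bool:
    # Illustrative rule: the label is positive if any forecasted score
    # exceeds the baseline by at least `min_increase`.
    return any(score_fn(v) - baseline >= min_increase for v in forecasts)


history = [[1.0], [1.5]]  # one toy variable observed at two time steps
forecasts = iterative_forecast(history, last_delta_model, horizon=3)
label = rule_fires(1.0, forecasts, score_fn=lambda v: v[0])
```

Because the rule is applied after forecasting, the same `forecasts` could be passed to a different `score_fn` or threshold to derive another label without retraining the forecaster, which is the label-independence the abstract describes.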
Validity problems in clinical machine learning by indirect data labeling using consensus definitions
Hagmann, Michael, Schamoni, Shigehiko, Riezler, Stefan
We demonstrate a validity problem of machine learning in the vital application area of disease diagnosis in medicine. It arises when target labels in training data are determined by an indirect measurement, and the fundamental measurements needed to determine this indirect measurement are included in the input data representation. Machine learning models trained on this data will learn nothing else but to exactly reconstruct the known target definition. Such models show perfect performance on similarly constructed test data but will fail catastrophically on real-world examples where the defining fundamental measurements are not or only incompletely available. We present a general procedure allowing identification of problematic datasets and black-box machine learning models trained on them, and exemplify our detection procedure on the task of early prediction of sepsis.
- North America > United States > California > San Francisco County > San Francisco (0.14)
- Europe > Germany (0.05)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Voices in AI – Episode 65: A Conversation with Luciano Floridi
Today's leading minds talk AI with host Byron Reese. In Episode 65 of Voices in AI, host Byron Reese and Luciano Floridi discuss ethics, information, AI, and government monitoring. They also dig into Luciano's new book "The Fourth Revolution" and ponder how technology will disrupt the job market in the days to come. Luciano Floridi holds multiple degrees, including a PhD in philosophy and logic from the University of Warwick. Luciano is currently a professor of philosophy and ethics of information, as well as the director of the Digital Ethics Lab at the University of Oxford. Along with his responsibilities as a professor, Luciano is also the chair of the Data Ethics Group at the Alan Turing Institute.
A chat with AI instructor Chris Mohritz (GigaOm)
Christopher Mohritz is a lifelong entrepreneur and technologist with a number of successful businesses under his belt, bringing a unique blend of technology know-how, creative thinking, and business acumen to each of his projects. Since 2009, Chris has been building and leveraging artificial intelligence systems to cognify a wide range of business functions, including marketing, sales, customer support, and decision automation. Over the past five years, he has been building and operating a business accelerator for web/mobile startups, helping other entrepreneurs launch exceptional "AI-first" businesses. Chris draws heavily from a deep background in technology, from operating nuclear reactors in the U.S. Navy to designing datacenters at Lockheed Martin. This is complemented by a broad range of business experience, from technical sales for the Fortune 500 to project management in the public sector.
- Government (0.75)
- Information Technology > Services (0.47)